In this document, we will outline the Bayesian analogs of the statistical analyses described in lecture 1 (Github code).
If you use RStudio’s Visual editor mode. Check if you can see anything written in the following parenthesis: (\(k\)). If not, your RStudio has a bug. This problem will be fixed in future versions. Meanwhile, if some text seems to be missing in the file below, simply position your cursor around the missing area to reveal the underlying code.
The following function extracts results from different models and generate results of the same format to be used in visualizations
tidy.wrapper = function(model) {
if (class(model) == "lm") {
tidy(model, conf.int = TRUE) %>%
select(-c(statistic, p.value)) %>%
mutate(model = "Frequentist") %>%
select(model, everything())
} else if (class(model) == "brmsfit") {
tidy(model) %>%
filter(effect == "fixed") %>%
select(c(term:conf.high)) %>%
mutate(model = "Bayesian") %>%
select(model, everything())
} else {
stop("unknown model class")
}
}Let’s first load and take a look at the data in a table:
dataset = readr::read_csv("data/blinded.csv", col_types = "dcd")
head(dataset)| experiment | condition | effectiveness |
|---|---|---|
| 1 | no_graph | 7 |
| 1 | graph | 6 |
| 1 | no_graph | 9 |
| 1 | no_graph | 4 |
| 1 | graph | 6 |
| 1 | no_graph | 7 |
Recall from the previous lecture that this dataset contains results from four experiments. In each experiment, participants are placed in either a graph or no graph condition, and asked to rate the effectiveness of the intervention on a 9-point Likert-style question. Here, we plot the data to show the distribution of these responses:
dataset %>%
mutate(effectiveness = fct_rev(factor(effectiveness, levels = 1:9)),
experiment = as.factor(experiment)) %>%
# stacked bar plot
ggplot(aes(x = condition, fill = effectiveness)) +
geom_bar(position = "stack", stat="count") +
# plot data for different experiments as small multiples
facet_wrap(. ~ experiment) +
# grey color scale is robust for colorblind
scale_fill_brewer(palette="Purples", drop = FALSE) +
# horizontal plot
coord_flip() +
# legend
guides(fill = guide_legend(reverse = TRUE)) For the purposes of this lecture, we will focus on the first experiment.
exp1.data = dataset %>%
filter(experiment == 1)
head(exp1.data)| experiment | condition | effectiveness |
|---|---|---|
| 1 | no_graph | 7 |
| 1 | graph | 6 |
| 1 | no_graph | 9 |
| 1 | no_graph | 4 |
| 1 | graph | 6 |
| 1 | no_graph | 7 |
The first model discussed in the previous lecture was the
Wilcoxon signed rank test. This is a non-parametric test
which we will skip for now. Although, there exists Bayesian
non-parametric methods, they are too advanced for this lecture.
The second model discussed was the t-test. Although R
contains a function (t.test) to perform this analysis, the
t-test is essentially a linear regression, and thus can be
performed using the linear model function in R (lm). The
following code shows the equivalent linear-model for a paired sample
t-test:
dataset1.lm.freqt <-
lm(
effectiveness ~ condition - 1, # we remove intercept
data = exp1.data
)Before we implement a Bayesian model, let us take a look at the distribution of responses. Although we know that we are going to use a t distribution, we still plot the data to set how it looks like. Usually, this will help us decide our likelihood function.
exp1.data %>%
ggplot(., aes(x = effectiveness)) +
geom_histogram(fill = DARK_PURPLE, color = NA, binwidth = 0.5, center = 0) +
scale_x_continuous(breaks = seq(0, 10, by = 1))We can see from the above plot that the responses are discrete. This is important to keep in mind when specifying a Bayesian model.
Let’s also look how a Student’s t distribution varies different parameter. Mu is location, sigma is dispersion (which is simiilar to the standard deviation parameter of the Gaussian distribution), and nu (df) control the tail.
ggplot() + xlim(-10, 10) +
geom_function(
color = "black",
fun = function(x)
dstudent_t(x, mu = 0, sigma = 1, df = 5) ,
size = 1
) +
geom_function(
color = "gray80",
fun = function(x)
dstudent_t(x, mu = 0, sigma = 1, df = 50),
size = 1
) +
geom_function(
color = "gray60",
fun = function(x)
dstudent_t(x, mu = 0, sigma = 4, df = 5),
size = 1
) +
geom_function(
color = "gray40",
fun = function(x)
dstudent_t(x, mu = 2, sigma = 1, df = 5),
size = 1
) +
xlab("x")Next, we will implement the Bayesian analog of the linear model
described earlier. Here’s what the model formula will look like when
implemented using the bf() function from the
brm package:
dataset1.brm.bayesiant.formula <- bf(effectiveness ~ condition - 1,
family = student())Let us take a minute to understand this code. It says that we are
regressing the variable condition on
effectiveness. The - 1 removes the intercept
term. The family argument is used to specify the probability
distribution of the likelihood — in other words, what is the
distribution of \(P(y)\).
The blank ones are flat (uniform) priors. These are improper priors
and usually needed to be set. We can and should adjust the priors given
by brm.
as_tibble(get_prior(dataset1.brm.bayesiant.formula, data = exp1.data))| prior | class | coef | group | resp | dpar | nlpar | bound | source |
|---|---|---|---|---|---|---|---|---|
| b | default | |||||||
| b | conditiongraph | default | ||||||
| b | conditionno_graph | default | ||||||
| gamma(2, 0.1) | nu | default | ||||||
| student_t(3, 0, 2.5) | sigma | default |
Next, we need to specify priors for the parameters in the model. We will discuss briefly how these prior distributions were obtained.
For the prior on b which is the mean parameter of the
Student’s t distribution (used as the likelihood), an unbiased
assumption would be that they should be centered around 5 (which is the
center of the 9 possible answers), and the mean would likely be greater
than 1 and less than 9 (unless every single participant responded either
1 or 9; but we know that was not the case).
The prior on sigma determines the dispersion parameter
of the Student’s t distribution. Now, considering that our data
is bounded between 1 and 9, a uniform distribution will have the maximum
variance given these constraints. The variance of such an uniform
distribution would be: sd(runif(1e4, 1, 9) which is
approximately 2.5. We know that our data will most likely have less
variance than such a uniform distribution. Thus, we want the prior on
sigma to assign very little probability mass for values
greater than 2.5. Our prior student_t(3, 0, 1) assigns less
than 0.05 probability to values greater than 2.5 (you can check by
running the following code in the console:
sum(gamlss.dist::rTF(1e4, 0, 1, 3) > 2.5) / 1e4).
dataset1.brm.bayesiant.priors = c(
prior(normal(5, 1.5), class = "b"),
prior(student_t(3, 0, 1), class = "sigma")
)Before we implement the regression model, it is advisable to perform some prior predictive checks. Bayesian models are generative. In other words, it is possible to sample values from the prior distributions, feed them through the likelihood function to obtain a prior predictive distribution.
The prior predictive distribution should ideally resemble the data generating process, and should not assign probability mass to unlikely or impossible values. If the prior predictive distribution is assigning substantial probability mass to unlikely or impossible values, we should adjust our choice of prior distributions. Prior predictive checks also help make explicit some of the assumptions that go into the prior specification process.
Another important thing to note is that brms often
specifies improper priors (denoted by flat) by default. If
the priors are improper, we cannot sample draws from the prior
predictive distribution.
The following code block implements some prior predictive checks:
tibble(
x = seq(0, 10, by = 0.01),
y = dnorm(x, 5, 1.5)
) %>%
ggplot(aes(x, y)) +
geom_line(color = "#b8925f", size = 2) +
scale_x_continuous(breaks = seq(1, 9, by = 1)) +
labs(y = "Density") +
theme(
panel.grid.major.y = element_line(),
axis.title.y = element_text(angle = 90, size = 24),
axis.text = element_text(size = 20)
)dataset1.brm.bayesiant.priorchecks <-
brm(
dataset1.brm.bayesiant.formula,
prior = dataset1.brm.bayesiant.priors,
data = exp1.data,
backend = BRM_BACKEND,
sample_prior = "only",
file = "rds/dataset1.brm.bayesiant.priorchecks.rds"
)
n_prior_draws <- 30
# extract n = 100 draws from the prior predictive distribution
dataset1.bayesiant.yprior <-
posterior_predict(dataset1.brm.bayesiant.priorchecks, ndraws = n_prior_draws)
# the following computes the probability density for each "draw"
dim(dataset1.bayesiant.yprior) <- n_prior_draws * 123
# create a new table and assign .draw number
tibble(.value = dataset1.bayesiant.yprior,
.draw = rep(1:n_prior_draws, times = 123)) %>%
ggplot() +
geom_density(
aes(x = .value, group = .draw),
color = DARK_PURPLE,
alpha = .5,
size = .5
) +
labs(y = "", x = 'Draws from the prior predictive distribution')Although, based on the above plot, it seems like the model is
assigning some non-zero probability to impossible values (less than 1 or
more than 9), these are primarily a result of our choice of the
likelihood function — the student_t() distribution does not
allow us to set bounds on the predictive distribution. Another reason
for extreme values may be the use of a Student’s t prior for
sigma with 3 degrees of freedom — this distribution has fat tails and
thus might result in predicting large values for the standard deviation
parameter in our model.
We will revisit this point later. For now, let’s move forward, and fit the model.
Next, we compute the posterior probability distribution using
brms and Stan. Depending on the complexity of
your model, this step may take a lot of time. Our model is not so
complex, hence this will not take too long :)
dataset1.brm.bayesiant =
brm(
dataset1.brm.bayesiant.formula,
prior = dataset1.brm.bayesiant.priors,
data = exp1.data,
backend = BRM_BACKEND,
# There are many other parameters you can play with.
# Here we use default parameters to simplify the lecture.
# We save the model in the following file.
file = "rds/dataset1.brm.bayesiant.rds"
)After the compilation step of the model is complete, we can evaluate
the fit and examine the result. Similar to lm(), the first
step is to call the summary() function on the variable
which contains the model compilation result:
summary(dataset1.brm.bayesiant)## Family: student
## Links: mu = identity; sigma = identity; nu = identity
## Formula: effectiveness ~ condition - 1
## Data: exp1.data (Number of observations: 123)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## conditiongraph 6.61 0.18 6.25 6.96 1.00 3840 2692
## conditionno_graph 6.77 0.17 6.43 7.11 1.00 3211 2355
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 1.30 0.11 1.07 1.52 1.00 3172 2490
## nu 16.84 11.62 4.36 46.74 1.00 3081 2832
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
This will give us an overview of the coefficients of the various parameters, along with the 95% credible intervals. This result shows that there is substantial overlap between the credible intervals for the two conditions. Before we delve deeper into the results, it is important to run some diagnostics on the sampling process.
First, we call bayesplot to draw posterior distributions
and MCMC traces for each parameter. We want to check whether the four
chairs have mixed well. If they have, this implies that the model has
converged.
color_scheme_set(scheme = "purple")
plot(dataset1.brm.bayesiant, newpage = T)Next, we need to examine if the posterior distributions computed by our MCMC process resembles the data. Each draw from the posterior predictive distribution should roughly resemble a “hypothetical experiment” with the same experimental design. Thus, if our posterior predictive distribution looks very different from the data, it implies that something has gone awry in our model building step (Step 1).
They are many ways you can draw from the posterior predictive
distribution. Here we use posterior_predict from
brms.
dataset1.bayesiant.y <- exp1.data$effectiveness
dataset1.bayesiant.yrep <-
posterior_predict(dataset1.brm.bayesiant)
dataset1.bayesiant.yrepgroup <-
posterior_predict(dataset1.brm.bayesiant, data.frame(condition = c('graph', 'no_graph')))
head(as_tibble(dataset1.bayesiant.yrep[,1:11]))| V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 | V11 |
|---|---|---|---|---|---|---|---|---|---|---|
| 6.440270 | 5.977234 | 6.900874 | 4.701375 | 7.170452 | 3.263761 | 6.592275 | 8.581487 | 4.573106 | 5.016169 | 7.093037 |
| 6.673812 | 8.328171 | 5.944403 | 7.329556 | 7.573959 | 4.559968 | 7.088508 | 7.700899 | 7.527388 | 5.979694 | 9.111188 |
| 6.680615 | 6.707617 | 9.736384 | 9.343441 | 8.087803 | 4.691732 | 5.436256 | 9.025326 | 5.613129 | 6.243677 | 5.763645 |
| 7.165171 | 3.938078 | 8.055621 | 7.618785 | 6.061229 | 4.580845 | 6.698171 | 3.310805 | 4.854779 | 4.683616 | 7.132660 |
| 5.600135 | 4.455032 | 6.197594 | 5.334783 | 8.090987 | 7.699026 | 6.578442 | 7.225055 | 6.790332 | 4.850451 | 6.231267 |
| 5.400997 | 5.004001 | 4.820987 | 7.282429 | 7.165793 | 7.385508 | 5.583039 | 8.057038 | 6.371572 | 5.565271 | 7.011921 |
We use bayesplot to perform these diagnostic tests.
First, we plot the first eight draws of posterior predictions and the
original data. Again, there are many ways to do this (and these will be
discussed in greater detail in Lecture 3).
ppc_hist(y = dataset1.bayesiant.y,
yrep = dataset1.bayesiant.yrep[1:8,],
binwidth = .5)An alternative approach would be to plot densities for each draw from the posterior predictive distribution superimposed with the density from the actual data:
ppc_dens_overlay(y = dataset1.bayesiant.y,
yrep = dataset1.bayesiant.yrep[1:30,])Finally, we look at whether our model is able to capture the summary statistics properly. Below we show the distribution of the estimated posterior mean for the two conditions, as well as the actual mean of the data in the two conditions. Since the actual mean appears to be centered around the estimated posterior distribution of the mean, we can conclude that it is roughly capturing the relevant information from the experimental data.
ppc_stat_grouped(y = dataset1.bayesiant.y,
yrep = dataset1.bayesiant.yrep,
group = exp1.data$condition, binwidth = .1)Now that we have performed model diagnostics, we would like to
interpret the results. The pertinent research question here is whether
there is a difference in the effectiveness rating between the
graph and no graph condition. We use conditional
distributions to compare the mean difference between the two conditions.
add_epred_draws() from the tidybayes package
is the most convenient way.
dataset1.bayesiant.posterior_epred <-
tibble(condition = c('graph', 'no_graph')) %>%
add_epred_draws(dataset1.brm.bayesiant,
re_formula = NA,
allow_new_levels = FALSE) %>%
ungroup()
head(dataset1.bayesiant.posterior_epred)| condition | .row | .chain | .iteration | .draw | .epred |
|---|---|---|---|---|---|
| graph | 1 | NA | NA | 1 | 6.475753 |
| graph | 1 | NA | NA | 2 | 6.745635 |
| graph | 1 | NA | NA | 3 | 6.589682 |
| graph | 1 | NA | NA | 4 | 6.552249 |
| graph | 1 | NA | NA | 5 | 6.525029 |
| graph | 1 | NA | NA | 6 | 6.525103 |
Here add_epred_draws() takes two arguments:
newdata: is used to generate predictions. This variable should describe the experimental design. In this example, our experimental design consists of two conditions, graph and no graph, which is being passed as the first argument.
object: the model fit object
We then transform the data, subtract the two conditions from each
other (using the compare_levels() function), and calculate
the credible intervals (using mean_qi()).
dataset1.bayesiant.posterior_comparison <-
dataset1.bayesiant.posterior_epred %>%
select(-c(.chain, .iteration, .row)) %>%
compare_levels(variable = .epred, by = condition)
head(dataset1.bayesiant.posterior_comparison)| .draw | condition | .epred |
|---|---|---|
| 1 | no_graph - graph | 0.1617545 |
| 2 | no_graph - graph | 0.1336147 |
| 3 | no_graph - graph | 0.2767903 |
| 4 | no_graph - graph | 0.2008134 |
| 5 | no_graph - graph | 0.0610346 |
| 6 | no_graph - graph | 0.0427875 |
dataset1.bayesiant.posterior_comparison %>%
mean_qi(.epred)| condition | .epred | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|
| no_graph - graph | 0.1579494 | -0.3061581 | 0.6378437 | 0.95 | mean | qi |
A basic way to interpret and communicate the results is to plot them. We plot this result using credible interval. In lecture 3, you will learn how to choose among these visualizations.
p3 = dataset1.bayesiant.posterior_epred %>%
ggplot(aes(x = .epred, fill = condition)) +
geom_density(
color = NA,
alpha = .5,
) +
scale_fill_manual(values = COLOR_PALETTE) +
xlab('')
p3dataset1.bayesiant.posterior_comparison %>%
ggplot(aes(x = .epred)) +
geom_density(
fill = DARK_PURPLE,
color = NA,
alpha = .5,
) +
geom_vline(aes(xintercept = 0), linetype = "dashed", color = "black") +
coord_cartesian(xlim = c(-1, 1)) +
ggtitle('no_graph - graph') +
xlab('difference in mean')dataset1.bayesiant.posterior_comparison %>%
median_qi(.epred) %>%
ggplot() +
geom_point(aes(x = .epred, y = condition), color = DARK_PURPLE, size = 3) +
geom_errorbarh(
aes(xmin = .lower, xmax = .upper, y = condition),
color = DARK_PURPLE,
alpha = .5,
size = 2,
height = 0
) +
geom_vline(aes(xintercept = 0), linetype = "dashed", color = "black") +
coord_cartesian(ylim = c(0, 2), xlim = c(-1, 1)) +
xlab('') + ylab('')dataset1.bayesiant.posterior_comparison %>%
mutate(if_greater = .epred < 0) %>%
summarise(`Pr(graph > no_graph)` = mean(if_greater))| condition | Pr(graph > no_graph) |
|---|---|
| no_graph - graph | 0.25975 |
Finally, we compare the results obtained from the Bayesian model to the frequentist estimates:
bind_rows(tidy.wrapper(dataset1.lm.freqt),
tidy.wrapper(dataset1.brm.bayesiant)) %>%
ggplot() +
geom_pointrange(
aes(
color = model,
y = estimate,
ymin = conf.low,
ymax = conf.high,
x = term
),
position = position_dodge(width = 0.2)
) +
scale_color_brewer(palette = "Set1") +
ylab('effectiveness') +
scale_y_continuous(breaks = 1:9, limits = c(1, 9)) +
coord_flip()A more appropriate to analyze ordinal data is to use ordinal logistic regression. In this section, we reanalyze the data using Bayesian ordinal logistic regression.
As always the first step is to specify the appropriate model formula:
dataset1.brm.olr.formula <-
bf(effectiveness ~ condition,
family = cumulative("logit"))Before we go further into the model building step, it is important to understand what a ordinal regression model is estimating. The figure below shows the cumulative proportion of participants who responded at least \(k\), i.e. \(Pr(y_i \leq k)\), on a Likert-style question.
# figure from: Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition by A Solomon Kurz
p1 = exp1.data %>%
count(effectiveness) %>%
mutate(pr_k = n / nrow(exp1.data),
cum_pr_k = cumsum(pr_k)) %>%
ggplot(aes(x = effectiveness, y = cum_pr_k,
fill = effectiveness)) +
geom_line(color = "#b8925f") +
geom_point(colour = "#b8925f",
size = 2.5,
stroke = 1) +
scale_x_continuous("Effectiveness", breaks = 1:9) +
scale_y_continuous("Cumulative Proportion",
breaks = seq(0, 1, by = 0.2),
limits = c(0, 1)) +
theme(axis.ticks = element_blank(),
legend.position = "none")
p2 = exp1.data %>%
count(effectiveness) %>%
mutate(pr_k = n / nrow(exp1.data),
log_cum_odds = logit(cumsum(pr_k))) %>%
ggplot(aes(x = effectiveness, y = log_cum_odds,
fill = effectiveness)) +
geom_line(color = "#b8925f") +
geom_point(colour = "#b8925f",
size = 2.5,
stroke = 1) +
scale_x_continuous("Effectiveness", breaks = 1:9) +
scale_y_continuous(
"Log-Cumulative Odds",
breaks = seq(-5, 5, by = 1),
limits = c(logit(0.005), logit(0.99))
) +
theme(axis.ticks = element_blank(),
legend.position = "none")
ggarrange(p1, p2)Since cumulative proportion is a value bounded between 0 and 1, linear regression models apply the logit function to apply the following transformation \(f: [0, 1] \to (-Inf, Inf)\). Applying this transformation to the data gives the plot on the right.
Ordinal regression estimates this cumulative probability: \(Pr(y_i \leq k)\). They do this using a series of parameters \(\alpha_k\) where \(k \in \{1, 2, 3, 4, 5, 6, 7, 8, 9\}\), according to the following equation:
\[ \begin{align} log(\frac{Pr(y_i <= k)}{1 - Pr(y_i <= k)}) = \alpha_k \end{align} \] Ordinal regression estimates this cumulative probability: \(Pr(y_i \leq k)\). Once we have estimated the cumulative probability, we can take the difference between successive items, \(Pr(y_i = k) = Pr(y_i \leq k) - Pr(y_i \leq k - 1)\), to estimate the discrete probability for each outcome:
# primary data
exp1.data_plot = exp1.data %>%
count(effectiveness) %>%
mutate(pr_k = n / nrow(exp1.data)) %>%
add_row(effectiveness = 2, n = 0, pr_k = 0) %>%
arrange(effectiveness) %>%
mutate(cum_pr_k = cumsum(n / nrow(exp1.data))) %>%
mutate(discrete_probability = ifelse(effectiveness == 1, cum_pr_k, cum_pr_k - pr_k))
text = exp1.data_plot %>%
mutate(
text = effectiveness,
effectiveness = effectiveness + 0.25,
cum_pr_k = ifelse(cum_pr_k - 0.05 < 0.065, 0.05, cum_pr_k - 0.05)
)
exp1.data_plot %>%
ggplot(aes(x = effectiveness, y = cum_pr_k)) +
geom_line(aes(color = cum_pr_k), color = "#b8925f") +
geom_linerange(aes(ymin = 0, ymax = cum_pr_k), alpha = 1/2, color = "#b8925f") +
geom_linerange(aes(x = effectiveness,
ymin = ifelse(effectiveness == 1, 0, discrete_probability),
ymax = cum_pr_k),
color = "black") +
geom_point(colour = "#b8925f", size = 2.5, stroke = 1) +
# number annotation
geom_text(data = text,
aes(label = text),
size = 4) +
scale_x_continuous("Effectiveness", breaks = 1:9) +
scale_y_continuous("Cumulative Proportion", breaks = seq(0, 1, by = 0.2), limits = c(0, 1)) +
theme(axis.ticks = element_blank(),
axis.title.y = element_text(angle = 90),
legend.position = "none")The next step is to specify the prior distributions. Let us first look at the default prior distributions:
as_tibble(get_prior(dataset1.brm.olr.formula, data = exp1.data))| prior | class | coef | group | resp | dpar | nlpar | bound | source |
|---|---|---|---|---|---|---|---|---|
| b | default | |||||||
| b | conditionno_graph | default | ||||||
| student_t(3, 0, 2.5) | Intercept | default | ||||||
| Intercept | 1 | default | ||||||
| Intercept | 2 | default | ||||||
| Intercept | 3 | default | ||||||
| Intercept | 4 | default | ||||||
| Intercept | 5 | default | ||||||
| Intercept | 6 | default | ||||||
| Intercept | 7 | default | ||||||
| Intercept | 8 | default |
We notice that the difference parameter, b does not have
a proper prior, so we should specify one. The Intercept term does have a
prior, but since the distributions are going to be logit-transformed it
is difficult to intuitively determine whether these are strong or
diffuse prior distributions. This is another important question that can
be answered through prior predictive checks:
# priors
dataset1.brm.olr.priors = c(
prior(normal(0, 1), class = "b")
)
dataset1.brm.olr.priorchecks <- brm(
dataset1.brm.olr.formula,
data = exp1.data,
prior = dataset1.brm.olr.priors,
iter = 15000,
warmup = 7500,
backend = BRM_BACKEND,
sample_prior = 'only',
file = "rds/dataset1.brm.bayesiant.priorchecks1.rds"
)
dataset1.olr.yprior <- posterior_predict(dataset1.brm.olr.priorchecks, ndraws = 100)
ggplot() +
geom_histogram(aes(x = dataset1.olr.yprior),
fill = DARK_PURPLE,
alpha = .5, size = 1,
binwidth = 0.5, center = 0) +
scale_x_continuous(breaks = 1:9, limits = c(0.5, 9.5)) +
labs(x = 'Prior predictive distribution', y = "") +
theme(
axis.text.y = element_blank()
)Our prior predictive check reveals that the model is assigning most
of the probability mass to middle answer (5). This is not necessarily a
bad thing, and it might reflect how participants behave. However, it is
possible that these priors are still diffuse, as noticed by the low
number of Bulk_ESS and Tail_ESS. These may act
as another diagnostic measure, and ideally you’d want these values to be
at least greater than 1000. Let’s us try a different prior
dataset1.brm.olr.priors = c(prior(normal(0, 1), class = "b"),
prior(student_t(3, 0, 1.5), class = "Intercept"))
dataset1.brm.olr.priorchecks <- brm(
dataset1.brm.olr.formula,
prior = dataset1.brm.olr.priors,
data = exp1.data,
iter = 15000,
warmup = 7500,
backend = BRM_BACKEND,
sample_prior = 'only',
file = "rds/dataset1.brm.bayesiant.priorchecks2.rds"
)
summary(dataset1.brm.olr.priorchecks)## Family: cumulative
## Links: mu = logit; disc = identity
## Formula: effectiveness ~ condition
## Data: exp1.data (Number of observations: 123)
## Draws: 4 chains, each with iter = 15000; warmup = 7500; thin = 1;
## total post-warmup draws = 30000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept[1] -3.23 2.22 -9.39 -0.25 1.01 757 182
## Intercept[2] -1.68 1.45 -4.88 0.42 1.00 947 360
## Intercept[3] -0.83 0.97 -2.82 0.97 1.00 19222 18896
## Intercept[4] -0.26 0.91 -2.07 1.50 1.00 10572 18752
## Intercept[5] 0.26 0.91 -1.49 2.09 1.00 3883 10659
## Intercept[6] 0.83 0.99 -0.97 2.91 1.00 3430 1194
## Intercept[7] 1.62 1.24 -0.45 4.49 1.00 6057 1876
## Intercept[8] 3.40 3.13 0.21 10.12 1.00 5841 2686
## conditionno_graph -0.01 1.00 -1.96 1.95 1.00 13088 12542
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## disc 1.00 0.00 1.00 1.00 NA NA NA
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Changing the priors appear to improve Bulk_ESS and
Tail_ESS slightly. Let’s take a look at the prior
predictive distribution:
dataset1.olr.yprior <-
posterior_predict(dataset1.brm.olr.priorchecks, ndraws = 100)
ggplot() +
geom_histogram(aes(x = dataset1.olr.yprior),
fill = '#351c75',
alpha = .5,
size = 1,
binwidth = .5) +
scale_x_continuous(breaks = 1:9, limits = c(0.5, 9.5)) +
xlab('prior draws') + ylab('')For now, let us proceed with these priors.
After we’ve completed the model building phase, we can then compile the model.
dataset1.brm.olr1 =
brm(
dataset1.brm.olr.formula,
prior = dataset1.brm.olr.priors,
data = exp1.data,
backend = BRM_BACKEND,
file = "rds/dataset1.brm.olr1.rds"
)
summary(dataset1.brm.olr1)## Family: cumulative
## Links: mu = logit; disc = identity
## Formula: effectiveness ~ condition
## Data: exp1.data (Number of observations: 123)
## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept[1] -4.79 0.94 -6.96 -3.23 1.00 1762 1141
## Intercept[2] -4.20 0.78 -5.98 -2.93 1.00 2466 1865
## Intercept[3] -3.46 0.57 -4.71 -2.47 1.00 3340 2834
## Intercept[4] -2.24 0.34 -2.97 -1.60 1.00 4238 3087
## Intercept[5] -1.32 0.26 -1.83 -0.81 1.00 4691 3695
## Intercept[6] -0.30 0.23 -0.74 0.17 1.00 4264 3276
## Intercept[7] 1.28 0.27 0.77 1.81 1.00 3594 3170
## Intercept[8] 2.26 0.34 1.63 2.96 1.00 3702 2956
## conditionno_graph 0.19 0.30 -0.38 0.78 1.00 4122 2809
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## disc 1.00 0.00 1.00 1.00 NA NA NA
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Notice that we get a few error messages about divergent transitions.
Here we adjust a few sampling parameters, namely
adapt_delta and max_treedepth which are passed
to the control argument, to help the MCMC sampler and avoid
divergent transitions:
dataset1.brm.olr2 =
brm(
dataset1.brm.olr.formula,
prior = dataset1.brm.olr.priors,
data = exp1.data,
backend = BRM_BACKEND,
warmup = 1500,
iter = 2500,
control = list(adapt_delta = 0.99, max_treedepth = 15),
file = "rds/dataset1.brm.olr2.rds"
)
summary(dataset1.brm.olr2)## Family: cumulative
## Links: mu = logit; disc = identity
## Formula: effectiveness ~ condition
## Data: exp1.data (Number of observations: 123)
## Draws: 4 chains, each with iter = 2500; warmup = 1500; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept[1] -4.86 1.08 -7.47 -3.21 1.00 2304 1698
## Intercept[2] -4.23 0.83 -6.17 -2.90 1.00 2907 2007
## Intercept[3] -3.47 0.57 -4.75 -2.47 1.00 3605 3118
## Intercept[4] -2.24 0.35 -2.95 -1.59 1.00 3962 3378
## Intercept[5] -1.32 0.27 -1.86 -0.81 1.00 4105 3240
## Intercept[6] -0.30 0.24 -0.75 0.17 1.00 4242 3291
## Intercept[7] 1.28 0.26 0.79 1.79 1.00 4248 3274
## Intercept[8] 2.25 0.33 1.63 2.94 1.00 4093 3365
## conditionno_graph 0.19 0.30 -0.41 0.78 1.00 4069 2912
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## disc 1.00 0.00 1.00 1.00 NA NA NA
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
We then perform the same model diagnostics steps that we had performed for the previous model.
plot(dataset1.brm.olr2, newpage = T, ask=FALSE)dataset1.olr.y <- exp1.data$effectiveness
dataset1.olr.yrep <- posterior_predict(dataset1.brm.olr2)
ppc_hist(y = dataset1.olr.y,
yrep = dataset1.olr.yrep[1000:1007, ],
binwidth = .5)ppc_dens_overlay(y = dataset1.olr.y,
yrep = dataset1.olr.yrep[2000:2050, ], adjust = .5)ppc_bars(y = dataset1.olr.y,
yrep = dataset1.olr.yrep, binwidth = .5)ppc_stat_grouped(y = dataset1.olr.y,
yrep = dataset1.olr.yrep,
group = exp1.data$condition)Finally we estimate the difference between the two conditions and plot the results:
dataset1.olr.posterior_epred <-
tibble(condition = c('graph', 'no_graph')) %>%
add_epred_draws(dataset1.brm.olr2,
re_formula = NA,
allow_new_levels = FALSE) %>%
ungroup()
head(dataset1.olr.posterior_epred)| condition | .row | .chain | .iteration | .draw | .category | .epred |
|---|---|---|---|---|---|---|
| graph | 1 | NA | NA | 1 | 1 | 0.0136202 |
| graph | 1 | NA | NA | 2 | 1 | 0.0144437 |
| graph | 1 | NA | NA | 3 | 1 | 0.0102796 |
| graph | 1 | NA | NA | 4 | 1 | 0.0056930 |
| graph | 1 | NA | NA | 5 | 1 | 0.0272409 |
| graph | 1 | NA | NA | 6 | 1 | 0.0041625 |
dataset1.olr.posterior_epred %>%
filter(.draw == 100) %>%
group_by(condition) %>%
rename(`probability` = .epred) %>%
ggplot(., aes(x = .category, y = `probability`, fill = condition, color = condition)) +
geom_bar(stat = 'identity', alpha = .5, color = NA, width = .3) +
geom_point(aes(group = condition)) +
geom_path(aes(group = condition)) +
facet_wrap(condition ~ ., ncol = 2) +
scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, .2)) +
scale_color_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
scale_fill_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
xlab('')dataset1.olr.posterior_epred %>%
select(-.chain, -.iteration, -.row) %>%
group_by(condition, .draw, .category) %>%
rename(probability = .epred) %>%
ungroup() %>%
group_by(.category, condition) %>%
median_qi(probability) %>%
ggplot(., aes(x = .category, y = probability, fill = condition, color = condition)) +
geom_bar(stat = 'identity', alpha = .5, color = NA, width = .3) +
geom_errorbar(aes(ymin = .lower, ymax = .upper), width = 0, size = 1) +
geom_point(aes(group = condition)) +
geom_path(aes(group = condition)) +
facet_wrap(condition ~ ., ncol = 2) +
scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, .2)) +
scale_color_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
scale_fill_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
xlab('')dataset1.olr.posterior_epred %>%
filter(.draw == 100) %>%
#arrange(.category) %>%
group_by(condition) %>%
mutate(`cumulative probability` = cumsum(.epred)) %>%
ggplot(., aes(x = .category, y = `cumulative probability`, fill = condition, color = condition)) +
geom_bar(stat = 'identity', alpha = .5, color = NA, width = .3) +
geom_point(aes(group = condition)) +
geom_path(aes(group = condition)) +
facet_wrap(condition ~ ., ncol = 2) +
scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, .2)) +
scale_color_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
scale_fill_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
xlab('')dataset1.olr.posterior_epred %>%
select(-.chain, -.iteration, -.row) %>%
group_by(condition, .draw) %>%
arrange(.category) %>%
mutate(`cumulative probability` = cumsum(.epred)) %>%
ungroup() %>%
group_by(.category, condition) %>%
median_qi(`cumulative probability`) %>%
ggplot(., aes(x = .category, y = `cumulative probability`, fill = condition, color = condition)) +
geom_bar(stat = 'identity', alpha = .5, color = NA, width = .3) +
geom_errorbar(aes(ymin = .lower, ymax = .upper), width = 0, size = 2) +
geom_point(aes(group = condition)) +
geom_path(aes(group = condition)) +
facet_wrap(condition ~ ., ncol = 2) +
scale_y_continuous(limits = c(0, 1), breaks = seq(0, 1, .2)) +
scale_color_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
scale_fill_manual(values = c(DARK_PURPLE, GOLD_COLOR)) +
xlab('')dataset1.olr.posterior_comparison <-
dataset1.olr.posterior_epred %>%
select(-c(.chain, .iteration, .row)) %>%
group_by(.category) %>%
compare_levels(variable = .epred, by = condition)
head(dataset1.olr.posterior_comparison %>%
mean_qi())| .category | condition | .epred | .lower | .upper | .width | .point | .interval |
|---|---|---|---|---|---|---|---|
| 1 | no_graph - graph | -0.0019084 | -0.0124921 | 0.0052925 | 0.95 | mean | qi |
| 2 | no_graph - graph | -0.0011591 | -0.0087963 | 0.0034439 | 0.95 | mean | qi |
| 3 | no_graph - graph | -0.0024396 | -0.0153921 | 0.0071908 | 0.95 | mean | qi |
| 4 | no_graph - graph | -0.0098904 | -0.0462694 | 0.0218328 | 0.95 | mean | qi |
| 5 | no_graph - graph | -0.0137213 | -0.0614789 | 0.0312567 | 0.95 | mean | qi |
| 6 | no_graph - graph | -0.0150607 | -0.0680144 | 0.0333193 | 0.95 | mean | qi |
dataset1.olr.posterior_comparison %>%
mean_qi(.epred) %>%
ggplot() +
geom_point(aes(y = .epred, x = .category), size = 3, color = DARK_PURPLE) +
geom_errorbar(
aes(ymin = .lower, ymax = .upper, x = .category),
width = 0,
size = 2,
color = DARK_PURPLE,
alpha = .5
) +
coord_cartesian(ylim = c(-.1, .1)) +
geom_hline(aes(yintercept = 0), linetype = "dashed", color = "gray") +
xlab('') + ylab('no_graph - graph') +
theme(axis.title.y = element_text(angle = 0, vjust = .5))The following dataset is from experiment 2 of “How Relevant are Incidental Power Poses for HCI?” (Jansen & Hornbæk, 2018). Study participants were asked to either make an expansive posture or a constrictive posture before performing a task. The experiment investigated whether posture could potentially have an effect on risk taking behavior.
First, we load the data.
dataset2 = readr::read_csv("data/exp2.csv", show_col_types = FALSE) %>%
mutate(condition = condition == 'expansive') %>%
group_by(participant)
head(dataset2)| participant | adjP10 | adjP20 | adjP30 | adjPDiff | condition | gender | index | change | subgroup | orig | BIS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 56.50000 | 55.80000 | 58.40000 | 1.90000 | TRUE | female | 69 | 3.362832 | expansiveHigh | 56.90000 | high |
| 2 | 27.87500 | 40.14286 | 36.00000 | 8.12500 | FALSE | female | 60 | 29.147982 | constrictiveLow | 34.67262 | low |
| 3 | 61.00000 | 59.00000 | 76.50000 | 15.50000 | TRUE | female | 55 | 25.409836 | expansiveLow | 65.50000 | low |
| 4 | 24.57143 | 32.75000 | 37.85714 | 13.28571 | FALSE | female | 79 | 54.069767 | constrictiveHigh | 31.72619 | high |
| 5 | 64.71429 | 54.00000 | 41.00000 | -23.71429 | TRUE | female | 69 | -36.644592 | expansiveHigh | 53.23810 | high |
| 6 | 35.14286 | 52.40000 | 45.60000 | 10.45714 | FALSE | female | 43 | 29.756098 | constrictiveLow | 44.38095 | low |
The data has been aggregated for each participant: -
condition = TRUE indicates expansive posture, and
condition = FALSE indicates constrictive posture - The
dependent variable is change which indicates the percentage
change in risk-taking behavior. Thus, it is a continuous variable.
For the purposes of this demo, we are only concerned with these two variables. We can ignore the other variables for now.
We plot the data to give us a better picture about its distribution.
dataset2 %>%
mutate(c = as.factor(condition)) %>%
ggplot(aes(x = change)) +
geom_histogram(
aes(y = ..density..),
binwidth = 10,
fill = DARK_PURPLE,
alpha = .5,
color = 'white'
) +
geom_density(size = 1,
adjust = 1.5,
color = '#9281bf') +
geom_function(
color = "#222222",
linetype = 'dashed',
fun = function(x)
dstudent_t(x, mu = 16, sigma = 39, df = 6),
size = 1
) +
scale_x_continuous(limits = c(-200, 200)) This model is the BEST test model as described by Kruschke in the paper Bayesian estimation supersedes the t-test. In this model, \(\beta\) indicates the mean difference in the outcome variable between the two groups (in this case, the percent change in the BART scores). We fit different priors on \(\beta\) and set different weights on these priors to obtain our posterior estimate.
\[ \begin{align} y_{i} &\sim \mathrm{T}(\nu, \mu, \sigma) \\ \mu &= \alpha_{0} + \beta * x_i \\ \sigma &= \sigma_{a} + \sigma_{b}*x_i \\ \beta &\sim \mathrm{N}(\mu_{0}, \sigma_{0}) \\ \sigma_a, \sigma_b &\sim \mathrm{Cauchy}(0, 2) \\ \nu &\sim \mathrm{exp}(1/30) \end{align} \]
We translate the above specification to R code.
dataset2.brm.student.formula <- bf(change ~ condition,
sigma ~ condition,
family = student())
head(as_tibble(get_prior(dataset2.brm.student.formula, data = dataset2)))| prior | class | coef | group | resp | dpar | nlpar | bound | source |
|---|---|---|---|---|---|---|---|---|
| b | default | |||||||
| b | conditionTRUE | default | ||||||
| student_t(3, 22.6, 38.8) | Intercept | default | ||||||
| gamma(2, 0.1) | nu | default | ||||||
| b | sigma | default | ||||||
| b | conditionTRUE | sigma | default |
IMPORTANT
TL;DR: Avoid using Cauchy distributions as priors as it may result in pathological behavior. Use a Normal or Student’s t distribution as prior instead. We go into further details below.
Note that, here, the authors Jansen and Hornbaek use a \(\mathrm{Cauchy}(0, 2)\) prior on \(\sigma\) based on the recommendation of Kruschke. While in theory, this distribution can be used as a proper prior distribution, in practice it is generally a bad idea to use Cauchy distributions as priors, especially for \(\sigma\). The mean and variance of a Cauchy distribution are undefined, because it has “fat tails” (i.e. the values in its tails have high probability). As a result, when used as a prior, the sampler might end up with very small or very large values. Because the \(\sigma\) parameter can only be positive, a log-link function is used i.e. the prior distribution is defined for \(log(\sigma)\) as the log of a positive-only value spans the entire real number space. So, \(\sigma = exp(x)\) where \(x\) is a value sampled from the Cauchy prior. Because of such extreme values sampled from a Cauchy distribution, coupled with the exponential transformation, you might get values which are rounded to 0 or infinity.
dataset2.brm.student.priorchecks = brm(
dataset2.brm.student.formula,
prior = c(
prior(normal(0, 2), class = "b"),
prior(cauchy(0, 2), class = "b", dpar = "sigma"),
prior(exponential(0.033), class = "nu"),
prior(student_t(3, 0, 5), class = "Intercept"),
prior(student_t(3, 0, 1), class = "Intercept", dpar = "sigma")
),
data = dataset2,
backend = BRM_BACKEND,
sample_prior = 'only',
file = "rds/dataset2.brm.student.priorchecks.rds"
)If we look at a histogram of the prior predictive distribution, we notice that it predicts really large values. Again, this is because the \(\mathrm{Cauchy}(0, 2)\) distribution has a long tail. However, ignoring extreme values, if we calculate the 95% quantile interval of this prior prediction, we find that it is [-565.42, 614.32]. Compare this interval to the data range: [-36.64, 200.61], we see that although our predicted interval is wider, it is not too bad.
min(dataset2$change)## [1] -36.64459
max(dataset2$change)## [1] 200.6135
dataset2.yprior <- posterior_predict(dataset2.brm.student.priorchecks)
ggplot() +
geom_histogram(aes(x = dataset2.yprior),
fill = DARK_PURPLE,
alpha = .5,
size = 1,
bins = 30,
center = 0) +
labs(x = 'Prior predictive distribution', y = "") +
theme(
axis.text.y = element_blank()
)dim(dataset2.yprior) <- 4000 * 80
quantile(dataset2.yprior, probs = c(.025, .975))## 2.5% 97.5%
## -565.4208 614.8581
dataset2.brm.student = brm(
bf(change ~ condition,
sigma ~ condition,
family = student()),
prior = c(
prior(normal(0, 2), class = "b"),
prior(cauchy(0, 2), class = "b", dpar = "sigma"),
prior(exponential(.033), class = "nu"),
prior(student_t(3, 0, 5), class = "Intercept"),
prior(student_t(3, 0, 2), class = "Intercept", dpar = "sigma")
),
data = dataset2,
backend = BRM_BACKEND,
file = "rds/dataset2.brm.student.rds"
)
dataset2.brm.student$prior| prior | class | coef | group | resp | dpar | nlpar | bound | source |
|---|---|---|---|---|---|---|---|---|
| normal(0, 2) | b | user | ||||||
| b | conditionTRUE | default | ||||||
| cauchy(0, 2) | b | sigma | user | |||||
| b | conditionTRUE | sigma | default | |||||
| student_t(3, 0, 5) | Intercept | user | ||||||
| student_t(3, 0, 2) | Intercept | sigma | user | |||||
| exponential(0.033) | nu | user |
summary(dataset2.brm.student)## Family: student
## Links: mu = identity; sigma = log; nu = identity
## Formula: change ~ condition
## sigma ~ condition
## Data: dataset2 (Number of observations: 80)
## Draws: 4 chains, each with iter = 1000; warmup = 0; thin = 1;
## total post-warmup draws = 4000
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 20.95 4.80 12.02 30.97 1.00 2901 2206
## sigma_Intercept 3.36 0.20 2.98 3.75 1.00 2534 2195
## conditionTRUE -0.19 1.92 -3.98 3.55 1.00 3894 2821
## sigma_conditionTRUE 0.14 0.22 -0.30 0.58 1.00 4263 2970
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## nu 4.56 5.29 1.61 14.40 1.01 1753 1433
##
## Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
plot(dataset2.brm.student, newpage = T)dataset2.student.y <- dataset2.brm.student$data$change
dataset2.student.yrep <- posterior_predict(dataset2.brm.student)
dataset2.student.epred <- tibble(condition = c(TRUE, FALSE)) %>%
add_epred_draws(dataset2.brm.student,
re_formula = NA,
allow_new_levels = TRUE) %>%
ungroup()ppc_hist(y = dataset2.student.y,
yrep = dataset2.student.yrep[100:107, ],
binwidth = 10)ppc_dens_overlay(y = dataset2.student.y,
yrep = dataset2.student.yrep[2000:2030, ])ppc_stat_grouped(
y = dataset2.student.y,
yrep = dataset2.student.yrep,
group = dataset2$condition,
binwidth = 5
)dataset2.student.epred_comparison <-
tibble(condition = c(TRUE, FALSE)) %>%
add_epred_draws(dataset2.brm.student,
re_formula = NA,
allow_new_levels = FALSE) %>%
ungroup() %>%
select(-c(.chain, .iteration, .row)) %>%
compare_levels(variable = .epred, by = condition) %>%
rename(diff = .epred)
head(dataset2.student.epred_comparison)| .draw | condition | diff |
|---|---|---|
| 1 | TRUE - FALSE | 0.0054167 |
| 2 | TRUE - FALSE | 0.2985710 |
| 3 | TRUE - FALSE | 1.1666700 |
| 4 | TRUE - FALSE | 1.4400100 |
| 5 | TRUE - FALSE | 0.5096380 |
| 6 | TRUE - FALSE | -1.4863400 |
dataset2.student.epred_comparison %>%
select(diff) %>%
mean_qi() %>%
ggplot() +
geom_point(aes(x = diff, y = condition), size = 3, color = DARK_PURPLE) +
geom_errorbarh(
aes(xmin = .lower, xmax = .upper, y = condition),
height = 0,
color = DARK_PURPLE,
alpha = .5,
size = 2
) +
geom_vline(aes(xintercept = 0), linetype = "dashed", color = "gray") +
coord_cartesian(ylim = c(0, 2), xlim = c(-5, 5)) +
scale_y_discrete(label = c('expansive - not expansive')) +
xlab('') + ylab('')dataset2.student.sigma.epred_comparison <-
tibble(condition = c(TRUE, FALSE)) %>%
add_epred_draws(dataset2.brm.student,
dpar = "sigma") %>%
ungroup() %>%
select(-c(.chain, .iteration, .row)) %>%
compare_levels(variable = sigma, by = condition) %>%
rename(sigma.diff = sigma)
head(dataset2.student.sigma.epred_comparison)| .draw | condition | sigma.diff |
|---|---|---|
| 1 | TRUE - FALSE | 0.5262710 |
| 2 | TRUE - FALSE | 6.1196667 |
| 3 | TRUE - FALSE | 5.9907197 |
| 4 | TRUE - FALSE | 9.7824273 |
| 5 | TRUE - FALSE | -0.5522292 |
| 6 | TRUE - FALSE | -5.9265989 |
dataset2.student.sigma.epred_comparison %>%
select(sigma.diff) %>%
mean_qi() %>%
ggplot() +
geom_point(aes(x = sigma.diff, y = condition), size = 3, color = DARK_PURPLE) +
geom_errorbarh(
aes(xmin = .lower, xmax = .upper, y = condition),
height = 0,
color = DARK_PURPLE,
alpha = .5,
size = 2
) +
geom_vline(aes(xintercept = 0), linetype = "dashed", color = "gray") +
coord_cartesian(ylim = c(0, 2), xlim = c(-15, 15)) +
scale_y_discrete(label = c('expansive - not expansive')) +
xlab('') + ylab('')sessionInfo()## R version 4.1.3 (2022-03-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur/Monterey 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] cmdstanr_0.4.0.9001 knitr_1.37 bayesplot_1.9.0
## [4] ggdist_3.1.1 tidybayes_3.0.2 brms_2.16.3
## [7] Rcpp_1.0.8.2 modelr_0.1.8 broom.mixed_0.2.7
## [10] broom_0.7.12 gtools_3.9.2 forcats_0.5.1
## [13] tidyr_1.2.0 ggpubr_0.4.0 ggplot2_3.3.5
## [16] purrr_0.3.4 tibble_3.1.6 dplyr_1.0.8
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-3 ggsignif_0.6.3 ellipsis_0.3.2
## [4] ggridges_0.5.3 markdown_1.1 base64enc_0.1-3
## [7] rstudioapi_0.13 farver_2.1.0 rstan_2.21.3
## [10] bit64_4.0.5 svUnit_1.0.6 DT_0.21
## [13] fansi_1.0.2 mvtnorm_1.1-3 diffobj_0.3.5
## [16] bridgesampling_1.1-2 codetools_0.2-18 splines_4.1.3
## [19] shinythemes_1.2.0 jsonlite_1.8.0 shiny_1.7.1
## [22] readr_2.1.2 compiler_4.1.3 backports_1.4.1
## [25] assertthat_0.2.1 Matrix_1.4-0 fastmap_1.1.0
## [28] cli_3.2.0 later_1.3.0 htmltools_0.5.2
## [31] prettyunits_1.1.1 tools_4.1.3 igraph_1.2.11
## [34] coda_0.19-4 gtable_0.3.0 glue_1.6.2
## [37] reshape2_1.4.4 posterior_1.2.1 carData_3.0-5
## [40] jquerylib_0.1.4 vctrs_0.3.8 nlme_3.1-155
## [43] crosstalk_1.2.0 tensorA_0.36.2 xfun_0.30
## [46] stringr_1.4.0 ps_1.6.0 mime_0.12
## [49] miniUI_0.1.1.1 lifecycle_1.0.1 rstatix_0.7.0
## [52] zoo_1.8-9 scales_1.1.1 vroom_1.5.7
## [55] colourpicker_1.1.1 hms_1.1.1 promises_1.2.0.1
## [58] Brobdingnag_1.2-7 parallel_4.1.3 inline_0.3.19
## [61] RColorBrewer_1.1-2 shinystan_2.6.0 yaml_2.3.5
## [64] gridExtra_2.3 loo_2.5.0 StanHeaders_2.21.0-7
## [67] sass_0.4.0 stringi_1.7.6 highr_0.9
## [70] dygraphs_1.1.1.6 checkmate_2.0.0 pkgbuild_1.3.1
## [73] rlang_1.0.2 pkgconfig_2.0.3 matrixStats_0.61.0
## [76] distributional_0.3.0 evaluate_0.15 lattice_0.20-45
## [79] labeling_0.4.2 rstantools_2.1.1 htmlwidgets_1.5.4
## [82] cowplot_1.1.1 bit_4.0.4 tidyselect_1.1.2
## [85] processx_3.5.2 plyr_1.8.6 magrittr_2.0.2
## [88] R6_2.5.1 generics_0.1.2 DBI_1.1.2
## [91] pillar_1.7.0 withr_2.5.0 xts_0.12.1
## [94] abind_1.4-5 crayon_1.5.0 car_3.0-12
## [97] arrayhelpers_1.1-0 utf8_1.2.2 tzdb_0.2.0
## [100] rmarkdown_2.13 grid_4.1.3 callr_3.7.0
## [103] threejs_0.3.3 digest_0.6.29 xtable_1.8-4
## [106] httpuv_1.6.5 RcppParallel_5.1.5 stats4_4.1.3
## [109] munsell_0.5.0 bslib_0.3.1 shinyjs_2.1.0